Language Extensions and Compilation Techniques for Data Intensive Computations

Authors

  • Gagan Agrawal
  • Renato Ferreira
  • Joel Saltz
Abstract

Processing and analyzing large volumes of data play an increasingly important role in many domains of scientific research. Typical examples of very large scientific datasets include long-running simulations of time-dependent phenomena that periodically generate snapshots of their state, archives of raw and processed remote sensing data, and archives of medical images. High-level language and compiler support for developing applications that analyze and process such datasets has, however, been lacking so far. We are developing language extensions and a compilation framework for expressing applications that process large multidimensional datasets in a high-level data-parallel fashion. We have chosen a dialect of Java for expressing these applications. Our dialect of Java includes data-parallel extensions for specifying collections of objects, a parallel for loop, and reduction variables. Our compiler will analyze parallel loops and optimize the processing of datasets through the use of an existing runtime system, called Active Data Repository (ADR), developed at the University of Maryland. We present the design of a compiler/runtime interface that allows the compiler to effectively utilize the existing runtime system. We show how interprocedural static program slicing can be used by the compiler to extract relevant information for the runtime system. Implementation of these compiler techniques is currently underway using the Titanium infrastructure.
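The abstract does not show the dialect's syntax, so the following is only a rough analogy in standard Java, not the authors' language extensions and not the ADR interface. It assumes a hypothetical `Sample` element type and a hypothetical `sumInRange` helper, and uses a parallel stream to stand in for the three ideas named above: a collection of objects, a parallel loop over it, and a reduction variable that accumulates an order-independent result.

```java
import java.util.Arrays;
import java.util.List;

public class ReductionSketch {
    /** Hypothetical dataset element: a 2-D grid position and a measured value. */
    static final class Sample {
        final int x;
        final int y;
        final double value;

        Sample(int x, int y, double value) {
            this.x = x;
            this.y = y;
            this.value = value;
        }
    }

    /**
     * Sums the values of all samples inside the rectangle [x0, x1] x [y0, y1].
     * The parallel stream plays the role of a parallel loop over the collection,
     * and the running sum plays the role of a reduction variable: addition is
     * associative and commutative, so elements may be combined in any order.
     */
    static double sumInRange(List<Sample> samples, int x0, int x1, int y0, int y1) {
        return samples.parallelStream()
                .filter(s -> s.x >= x0 && s.x <= x1 && s.y >= y0 && s.y <= y1)
                .mapToDouble(s -> s.value)
                .sum();
    }

    public static void main(String[] args) {
        List<Sample> samples = Arrays.asList(
                new Sample(0, 0, 1.0),
                new Sample(1, 0, 2.0),
                new Sample(0, 1, 3.0),
                new Sample(5, 5, 10.0));
        // Only the three samples inside [0, 1] x [0, 1] contribute to the sum.
        System.out.println(sumInRange(samples, 0, 1, 0, 1));
    }
}
```

In the setting the abstract describes, a compiler would derive the range query and the per-element work from such a loop and hand them to the runtime system; here the filter and the sum are simply written out by hand.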


Similar resources

High Level Programming Methodologies for Data Intensive Computations

Solving problems that have large computational and storage requirements is becoming increasingly critical for advances in many domains of science and engineering. By allowing algorithms for such problems to be programmed in widely used or rapidly emerging high-level paradigms, like object-oriented and declarative programming models, rapid prototyping and easy development of computational techni...


Optimization of query evaluation for multidimensional raster databases

Many interpreted languages suffer from higher processing times, mostly due to the overhead introduced by the "virtual machine" abstraction layer. A typical situation where interpreters are much slower than compiled programs is when complex computations are needed. JIT (just-in-time) compilation techniques proved very successful in solving this problem; however, not many languages implement ...


HPF-2 Support for Dynamic Sparse Computations

There is a class of sparse matrix computations, such as direct solvers of systems of linear equations, that change the fill-in (nonzero entries) of the coefficient matrix, and involve row and column operations (pivoting). This paper addresses the problem of the parallelization of these sparse computations from the point of view of the parallel language and the compiler. Dynamic data structures ...


Vienna-Fortran/HPF Extensions for Sparse and Irregular Problems and Their Compilation

Vienna Fortran, High Performance Fortran (HPF), and other data parallel languages have been introduced to allow the programming of massively parallel distributed-memory machines (DMMP) at a relatively high level of abstraction, based on the SPMD paradigm. Their main features include directives to express the distribution of data and computations across the processors of a machine. In this paper...


A Backend Extension Mechanism for PQL/Java with Free Run-Time Optimisation

In many data processing tasks, declarative query programming offers substantial benefit over manual data analysis: the query processors found in declarative systems can use powerful algorithms such as query planning to choose high-level execution strategies during compilation. However, the principal downside of such languages is that their primitives must be carefully curated, to allow the quer...



Publication date: 2000